AITopics | large-scale recommendation system

Enhancing Performance and Scalability of Large-Scale Recommendation Systems with Jagged Flash Attention

Xu, Rengan, Yang, Junjie, Xu, Yifan, Li, Hong, Liu, Xing, Shankar, Devashish, Zhang, Haoci, Liu, Meng, Li, Boyang, Hu, Yuxi, Tang, Mingwei, Zhang, Zehua, Zhang, Tunhou, Li, Dai, Chen, Sijia, Musumeci, Gian-Paolo, Zhai, Jiaqi, Zhu, Bill, Yan, Hong, Reddy, Srihari

arXiv.org Artificial Intelligence, Sep-19-2024

The integration of hardware accelerators has significantly advanced the capabilities of modern recommendation systems, enabling the exploration of complex ranking paradigms previously deemed impractical. However, the GPU-based computational costs present substantial challenges. In this paper, we demonstrate our development of an efficiency-driven approach to explore these paradigms, moving beyond traditional reliance on native PyTorch modules. We address the specific challenges posed by ranking models' dependence on categorical features, which vary in length and complicate GPU utilization. We introduce Jagged Feature Interaction Kernels, a novel method designed to extract fine-grained insights from long categorical features through efficient handling of dynamically sized tensors. We further enhance the performance of attention mechanisms by integrating Jagged tensors with Flash Attention. Our novel Jagged Flash Attention achieves up to 9x speedup and 22x memory reduction compared to dense attention. Notably, it also outperforms dense flash attention, with up to 3x speedup and 53% more memory efficiency. In production models, we observe 10% QPS improvement and 18% memory savings, enabling us to scale our recommendation systems with longer features and more complex architectures.
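
As a rough illustration of why variable-length categorical features complicate dense GPU layouts, the following minimal sketch packs rows of different lengths into a flat values list plus offsets and applies a softmax per row. It is a pure-Python toy and not the paper's GPU kernels; the helper names `to_jagged`, `padded_size` and `jagged_softmax` and the sample scores are hypothetical.

```python
import math


def to_jagged(sequences):
    """Pack variable-length rows into (values, offsets) with no padding."""
    values, offsets = [], [0]
    for seq in sequences:
        values.extend(seq)
        offsets.append(len(values))
    return values, offsets


def padded_size(sequences):
    """Number of slots a dense layout padded to the longest row would need."""
    return len(sequences) * max((len(s) for s in sequences), default=0)


def jagged_softmax(values, offsets):
    """Apply a numerically stable softmax to each row independently."""
    out = []
    for start, end in zip(offsets, offsets[1:]):
        row = values[start:end]
        if not row:
            continue  # an empty row contributes no values
        row_max = max(row)
        exps = [math.exp(v - row_max) for v in row]
        total = sum(exps)
        out.extend(e / total for e in exps)
    return out


# Hypothetical attention scores for three rows of lengths 2, 1 and 4.
scores = [[0.0, 0.0], [1.0], [2.0, 2.0, 2.0, 2.0]]
values, offsets = to_jagged(scores)
probs = jagged_softmax(values, offsets)

# The jagged layout stores 7 values; padding to length 4 would need 12 slots.
print(len(values), padded_size(scores))
```

The gap between stored values and padded slots grows with the spread of row lengths, which is the kind of waste a jagged layout is meant to avoid.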

arxiv preprint arxiv, enhancing performance and scalability, large-scale recommendation system, (7 more...)

doi: 10.1145/3640457.3688040

arXiv: 2409.15373

Country:

Europe > Italy > Apulia > Bari (0.06)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.41)

Uncertainty of Joint Neural Contextual Bandit

Guo, Hongbo, Zhu, Zheqing

arXiv.org Artificial Intelligence, Jun-4-2024

Contextual bandit learning is increasingly favored in modern large-scale recommendation systems. To better utilize the contextual information and available user or item features, neural networks have been integrated into contextual bandit learning, which has triggered significant interest from both academia and industry. However, a major challenge arises when implementing a disjoint neural contextual bandit solution in large-scale recommendation systems, where each item or user may correspond to a separate bandit arm. The huge number of items to recommend poses a significant hurdle for real-world production deployment. This paper focuses on a joint neural contextual bandit solution which serves all items to be recommended in one single model. The output consists of a predicted reward $\mu$, an uncertainty $\sigma$ and a hyper-parameter $\alpha$ which balances exploitation and exploration, e.g., $\mu + \alpha \sigma$. The tuning of the parameter $\alpha$ is typically heuristic and complex in practice due to its stochastic nature. To address this challenge, we provide both theoretical analysis and experimental findings regarding the uncertainty $\sigma$ of the joint neural contextual bandit model. Our analysis reveals that $\sigma$ demonstrates an approximate square root relationship with the size of the last hidden layer $F$ and an inverse square root relationship with the amount of training data $N$, i.e., $\sigma \propto \sqrt{\frac{F}{N}}$. The experiments, conducted with real industrial data, align with the theoretical analysis, help in understanding model behaviors, and assist hyper-parameter tuning during both offline training and online deployment.
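
The scoring rule $\mu + \alpha \sigma$ and the scaling $\sigma \propto \sqrt{F/N}$ can be made concrete with a small sketch. It assumes the proportionality holds exactly, which the abstract only states approximately; the function names `ucb_score`, `pick_item` and `rescale_sigma` and all numbers are hypothetical and not taken from the paper.

```python
import math


def ucb_score(mu, sigma, alpha):
    """Exploration-aware score: predicted reward plus scaled uncertainty."""
    return mu + alpha * sigma


def pick_item(items, alpha):
    """Return the item name with the highest mu + alpha * sigma."""
    return max(items, key=lambda name: ucb_score(*items[name], alpha))


def rescale_sigma(sigma_ref, f_ref, n_ref, f_new, n_new):
    """Rescale a reference uncertainty, assuming sigma is proportional to sqrt(F / N)."""
    return sigma_ref * math.sqrt((f_new * n_ref) / (f_ref * n_new))


# Hypothetical (mu, sigma) pairs for two candidate items.
items = {"a": (0.50, 0.05), "b": (0.45, 0.20)}

greedy_choice = pick_item(items, alpha=0.0)      # pure exploitation
exploring_choice = pick_item(items, alpha=1.0)   # uncertainty gets weight

# With the same last hidden layer size F, four times the data halves sigma.
sigma_4x_data = rescale_sigma(0.2, f_ref=64, n_ref=1_000_000,
                              f_new=64, n_new=4_000_000)
print(greedy_choice, exploring_choice, sigma_4x_data)
```

Under this assumption, a value of $\alpha$ tuned at one combination of $F$ and $N$ produces a smaller exploration bonus as $N$ grows, which is one reason the relationship could help when retuning between offline training and online deployment.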

algorithm, bandit, contextual bandit, (11 more...)

arXiv: 2406.02515

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Drifter: Efficient Online Feature Monitoring for Improved Data Integrity in Large-Scale Recommendation Systems

Škrlj, Blaž, Ki-Tov, Nir, Edelist, Lee, Silberstein, Natalia, Weisman-Zohar, Hila, Mramor, Blaž, Kopič, Davorin, Ziporin, Naama

arXiv.org Artificial Intelligence, Sep-20-2023

Real-world production systems often grapple with maintaining data quality in large-scale, dynamic streams. We introduce Drifter, an efficient and lightweight system for online feature monitoring and verification in recommendation use cases. Drifter addresses limitations of existing methods by delivering agile, responsive, and adaptable data quality monitoring, enabling real-time root cause analysis, drift detection and insights into problematic production events. Integrating state-of-the-art online feature ranking for sparse data and anomaly detection ideas, Drifter is highly scalable and resource-efficient, requiring only two threads and less than a gigabyte of RAM per production deployment that handles millions of instances per minute. Evaluation on real-world data sets demonstrates Drifter's effectiveness in alerting on and mitigating data quality issues, substantially improving the reliability and performance of real-time live recommender systems.

efficient online feature monitoring, improved data integrity, large-scale recommendation system, (1 more...)

arXiv: 2309.08617

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.69)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.53)
